Abstract

Consider a linear stochastic system whose initial state is a random vector with a specified Gaussian distribution. 
Such a distribution may represent a collection of particles abiding by the specified system dynamics. In recent 
publications, we have shown that, provided the system is controllable, it is always possible to steer the state 
covariance to any specified terminal Gaussian distribution using state feedback. The purpose of the present work is 
to show that, in the case where only partial state observation is available, a necessary and sufficient condition for 
being able to steer the system to a specified terminal Gaussian distribution for the state vector is that the terminal 
state covariance be greater (in the positive-definite sense) than the error covariance of a corresponding Kalman filter. 


Keywords: Linear stochastic systems, stochastic control, covariance control, Kalman filter. 


I. Introduction 

The classical paradigm of stochastic optimal control is to regulate the response of a system so as to minimize
a prescribed performance index. The quality of the response, in terms of tracking reference signals and/or reaching
a prescribed destination point, is encoded in the performance index, which penalizes deviation from the desired
mean-value response. The effect of state uncertainty at the starting point, and of stochastic disturbances and
measurement noise, is that sample paths of the state and output processes incur a certain amount of spread. The
role of the performance index is precisely to limit this spread, indirectly, as a result of the optimal strategy that
keeps the cost low and hence curtails deviation from the desired mean response.

The viewpoint presented here is based on recent work by the authors [1], [2]. It departs substantially from this
classical recipe and aims to specify the spread of the state vector directly. In that work, we first considered the
question of whether specific state distributions are attainable over a finite or infinite time interval through
(noise-free) state feedback. In the present paper we address, for the first time, the control of the state distribution
over a finite or infinite interval via output feedback in the presence of measurement noise. It turns out that our
ability to steer the state distribution, as compared to what is possible by noise-free state feedback, is limited only
by the requirement that admissible state covariances exceed the error covariance of a corresponding Kalman filter.

For motivation and background we refer to [1], [2], as well as to the largely expository paper [3], which has
also been submitted to these proceedings (CDC 2015).


II. Finite horizon steering 


Consider the linear time-invariant (LTI) system

$$ dx(t) = Ax(t)\,dt + Bu(t)\,dt + B_1\,dw(t), \tag{1a} $$
$$ dy(t) = Cx(t)\,dt + D\,dv(t), \tag{1b} $$

where

$$ (A, B, B_1, C, D) \in \mathbb{R}^{n\times n} \times \mathbb{R}^{n\times m} \times \mathbb{R}^{n\times m_1} \times \mathbb{R}^{p\times n} \times \mathbb{R}^{p\times p}, $$

$(A, B)$ is controllable, $(A, C)$ is observable, $D$ is invertible, and $x$, $y$, $u$, $w$, $v$ represent the state, output,
control input, process noise, and measurement noise, respectively. We assume that $w$ and $v$ are both standard Wiener
processes and independent of each other. We assume that at $t = 0$ the state, $x(0)$, is Gaussian with mean equal to
zero and covariance $\Sigma_0 > 0$, i.e., having probability density

$$ \rho_0(x) = (2\pi)^{-n/2} \det(\Sigma_0)^{-1/2} \exp\left(-\tfrac{1}{2}\, x'\Sigma_0^{-1}x\right). \tag{2} $$

The assumption on having zero-mean is made only for simplicity of the exposition and can be easily removed. 

Our goal is to steer the state distribution using dynamic output feedback to a "target" Gaussian end-point
distribution

$$ \rho_T(x) = (2\pi)^{-n/2} \det(\Sigma_T)^{-1/2} \exp\left(-\tfrac{1}{2}\, x'\Sigma_T^{-1}x\right), \tag{3} $$

for the state vector, with $\Sigma_T$ a given symmetric and positive definite $n \times n$ matrix.


A. Feasibility conditions 

We first determine conditions on the terminal state covariance $\Sigma_T$ that permit the existence of a control that
steers the stochastic system to the corresponding end-point state density (3).

Consider state estimates provided by the Kalman filter

$$ d\hat{x}(t) = A\hat{x}(t)\,dt + Bu(t)\,dt + L(t)\left(dy(t) - C\hat{x}(t)\,dt\right), \tag{4} $$

with (optimal) gain

$$ L(t) = P(t)C'(DD')^{-1} $$

and $P(t)$ the state error covariance obtained by solving the differential Riccati equation

$$ \dot{P}(t) = AP(t) + P(t)A' + B_1B_1' - P(t)C'(DD')^{-1}CP(t) \tag{5} $$

with initial condition $P(0) = \Sigma_0$. As usual, we denote by $\tilde{x}(t) = x(t) - \hat{x}(t)$ the estimation error, which
satisfies

$$ d\tilde{x}(t) = (A - L(t)C)\tilde{x}(t)\,dt + B_1\,dw(t) - L(t)D\,dv(t) $$

and is orthogonal to $\hat{x}(t)$, i.e., $E(\hat{x}(t)\tilde{x}(t)') = 0$. It follows that

$$ \begin{aligned}
\Sigma(T) &:= E\left(x(T)x(T)'\right) \\
&= E\left((\hat{x}(T) + \tilde{x}(T))(\hat{x}(T)' + \tilde{x}(T)')\right) \\
&= E\left(\tilde{x}(T)\tilde{x}(T)'\right) + E\left(\hat{x}(T)\hat{x}(T)'\right) \\
&= P(T) + E\left(\hat{x}(T)\hat{x}(T)'\right) \ge P(T).
\end{aligned} \tag{6} $$

Therefore,

$$ \Sigma_T \ge P(T) \tag{7} $$

is a necessary condition for a terminal state covariance to be "reachable" through suitable steering of the system
dynamics. Our first result states that the strict inequality $\Sigma_T > P(T)$ is in fact sufficient. This relies on
[2, Theorem 3], which establishes a "controllability" result for a matrix differential Lyapunov equation. We note that
the above argument on the necessity of (7) does not assume any particular form for the functional dependence of
the control input $u$ on the output $y$. Yet, in the proof of the theorem below it is seen that, under the slightly
stronger condition $\Sigma_T > P(T)$, a control input of the form $u = -K\hat{x}$ is sufficient to ensure the terminal
distribution of $x(T)$.
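As a minimal numerical sketch of the feasibility test, assuming NumPy and SciPy are available, one can integrate the Riccati equation (5) and check whether $\Sigma_T - P(T)$ is positive definite. The function names `riccati_error_covariance` and `is_reachable` are hypothetical, the tolerances are illustrative, and the data are assumed to match the double-integrator example of Section IV with terminal covariance $\tfrac{1}{2}I$.

```python
import numpy as np
from scipy.integrate import solve_ivp


def riccati_error_covariance(A, B1, C, D, Sigma0, T):
    """Integrate the filter Riccati equation (5) from P(0) = Sigma0; return P(T)."""
    n = A.shape[0]
    R_inv = np.linalg.inv(D @ D.T)  # D is assumed invertible, as in the text

    def rhs(_t, p_flat):
        P = p_flat.reshape(n, n)
        dP = A @ P + P @ A.T + B1 @ B1.T - P @ C.T @ R_inv @ C @ P
        return dP.ravel()

    sol = solve_ivp(rhs, (0.0, T), Sigma0.ravel(), rtol=1e-10, atol=1e-12)
    P_T = sol.y[:, -1].reshape(n, n)
    return 0.5 * (P_T + P_T.T)  # remove round-off asymmetry


def is_reachable(Sigma_T, P_T, tol=1e-9):
    """Check the strict inequality Sigma_T > P(T) via the smallest eigenvalue."""
    return bool(np.linalg.eigvalsh(Sigma_T - P_T).min() > tol)


# Double-integrator data (assumed to match the example of Section IV).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B1 = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.1]])

P1 = riccati_error_covariance(A, B1, C, D, np.eye(2), T=1.0)
feasible = is_reachable(0.5 * np.eye(2), P1)
```

The eigenvalue test is one simple way to check positive definiteness; a Cholesky factorization attempt would serve equally well.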

Theorem 1: Given the stochastic linear system (1) with distribution for the state vector at $t = 0$ specified by
(2), and given

$$ \Sigma_T > P(T), \tag{8} $$

where $P(t)$ satisfies the Riccati equation (5) with initial condition $P(0) = \Sigma_0$, there exists a control process
$u(t)$, adapted to the output process $y(t)$, such that the distribution of the state vector at $t = T$ is given by the
density in (3).

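Proof: Consider a control $u(t) = -K(t)\hat{x}(t)$, where $\hat{x}(t)$ is the state of the Kalman filter and $K(t)$ a
time-varying gain matrix. The state covariance of the combined state and estimation-error system

$$ \begin{bmatrix} dx \\ d\tilde{x} \end{bmatrix}
= \begin{bmatrix} A-BK & BK \\ 0 & A-LC \end{bmatrix}
\begin{bmatrix} x \\ \tilde{x} \end{bmatrix} dt
+ \begin{bmatrix} B_1\,dw \\ B_1\,dw - LD\,dv \end{bmatrix} $$

satisfies the matrix differential Lyapunov equation

$$ \frac{d}{dt}\begin{bmatrix} \Sigma & P \\ P & P \end{bmatrix}
= \begin{bmatrix} A-BK & BK \\ 0 & A-LC \end{bmatrix}
\begin{bmatrix} \Sigma & P \\ P & P \end{bmatrix}
+ \begin{bmatrix} \Sigma & P \\ P & P \end{bmatrix}
\begin{bmatrix} A-BK & BK \\ 0 & A-LC \end{bmatrix}'
+ \begin{bmatrix} B_1B_1' & B_1B_1' \\ B_1B_1' & B_1B_1' + LDD'L' \end{bmatrix}. $$

Denote by $\hat{\Sigma} := \Sigma - P$ the covariance of the Kalman filter state $\hat{x}$. From the above it readily
follows that

$$ \dot{\hat{\Sigma}} = (A - BK)\hat{\Sigma} + \hat{\Sigma}(A - BK)' + LDD'L'. \tag{9} $$

Since $\Sigma(t) = \hat{\Sigma}(t) + P(t)$ for all $t$, it suffices to steer $\hat{\Sigma}(t)$ to a terminal value
$\hat{\Sigma}(T) = \Sigma_T - P(T) > 0$ with a suitable choice of $K(t)$ over $[0, T]$.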
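For completeness, the step leading to the equation for $\hat{\Sigma} = \Sigma - P$ can be spelled out as follows. This is a routine verification that uses only the upper-left block of the Lyapunov equation, the Riccati equation (5), and the definition of $L(t)$; it is not an additional result.

```latex
\begin{aligned}
\dot{\Sigma} &= (A-BK)\Sigma + \Sigma(A-BK)' + BKP + PK'B' + B_1B_1', \\
\dot{P} &= AP + PA' + B_1B_1' - LDD'L',
  \qquad \text{since } PC'(DD')^{-1}CP = LDD'L', \\
\dot{\Sigma} - \dot{P}
  &= (A-BK)\Sigma - (A-BK)P + \Sigma(A-BK)' - P(A-BK)' + LDD'L' \\
  &= (A-BK)(\Sigma - P) + (\Sigma - P)(A-BK)' + LDD'L'.
\end{aligned}
```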


The claim of the theorem now basically follows from [2, Theorem 3], which states that a differential Lyapunov
equation

$$ \dot{Q} = AQ + QA' + BU(t)' + U(t)B' $$

is controllable, i.e., $Q(t)$ can be steered by a proper choice of $U(t)$ between any two conditions at $t = 0$ and
$t = T$, if and only if the pair $(A, B)$ in (1) is controllable, and moreover, the path $Q(t)$ for $t \in [0, T]$ can
remain within the cone of positive definite matrices provided the boundary conditions $Q(0)$ and $Q(T)$ are positive
definite. The corresponding value for $K(t)$ is $-U(t)'Q(t)^{-1}$. The only technical issue we need to address is that
the conditions in [2, Theorem 3] require that the initial covariance be positive definite, while here the initial
condition for $Q$ is $\hat{\Sigma}(0) = 0$. To this end, we consider the choice $K \equiv 0$ over a short window of time,
$[0, \epsilon)$. As we explain below, the conditions of the theorem will be fulfilled at $t = \epsilon$, and thence we
apply [2, Theorem 3].


We claim that the solution to (9) with $K \equiv 0$, namely

$$ \dot{\hat{\Sigma}} = A\hat{\Sigma} + \hat{\Sigma}A' + LDD'L' \quad \text{with } \hat{\Sigma}(0) = 0, \tag{10} $$

satisfies $\hat{\Sigma}(t) > 0$ for any $t > 0$. To see that this is true, first note that $\hat{\Sigma} = \Sigma - P$, where
$P$ satisfies (5) and $\Sigma$ satisfies

$$ \dot{\Sigma} = A\Sigma + \Sigma A' + B_1B_1'. \tag{11} $$

Hence, both $\Sigma(\cdot)$ and $P(\cdot)$ are continuously differentiable functions with the same initial value
$\Sigma(0) = P(0) = \Sigma_0$ and therefore $\hat{\Sigma}(\epsilon) = \Sigma(\epsilon) - P(\epsilon)$ is of order $O(\epsilon)$
for small $\epsilon > 0$. Rewrite (11) and (5) as

$$ \frac{d}{dt}\Sigma^{-1} = -A'\Sigma^{-1} - \Sigma^{-1}A - \Sigma^{-1}B_1B_1'\Sigma^{-1}, $$
$$ \frac{d}{dt}P^{-1} = -A'P^{-1} - P^{-1}A - P^{-1}B_1B_1'P^{-1} + C'(DD')^{-1}C. $$

It follows that

$$ e^{A't}\left(P(t)^{-1} - \Sigma(t)^{-1}\right)e^{At}
= \int_0^t e^{A'\tau}C'(DD')^{-1}Ce^{A\tau}\,d\tau - \int_0^t M(\tau)\,d\tau, \tag{12} $$

where

$$ M(\tau) = e^{A'\tau}\left(P(\tau)^{-1}B_1B_1'P(\tau)^{-1} - \Sigma(\tau)^{-1}B_1B_1'\Sigma(\tau)^{-1}\right)e^{A\tau} $$

is differentiable and satisfies $M(0) = 0$. Thus, the second term on the right-hand side of equation (12) is of order
$O(t^2)$ for small $t$, while the first term,

$$ \int_0^t e^{A'\tau}C'(DD')^{-1}Ce^{A\tau}\,d\tau, $$

is of order $O(t)$. Moreover, this first term is strictly positive for any $t > 0$ since $(A, C)$ is observable. Therefore,

$$ e^{A't}\left(P(t)^{-1} - \Sigma(t)^{-1}\right)e^{At} > 0 $$

for $t > 0$ small enough. We conclude that $\Sigma(t) > P(t)$, that is, $\hat{\Sigma}(t) > 0$, for small $t$. The fact that
$\hat{\Sigma}(t) > 0$ for larger $t$ as well readily follows from (10). ■


B. Sufficient conditions for optimality 


So far, we have established that, provided $\Sigma_T > P(T)$ as in Theorem 1, it is possible to steer (1) from the
initial probability density $\rho_0$ to the "target" final probability density $\rho_T$. Further, it is easy to see from
the constructive proof of [2, Theorem 3] that steering can be effected with a finite-energy control. Thus, from now
on, we denote by $\mathcal{U}$ the (non-empty) family of admissible control processes $u$, that is, processes that are
adapted¹ to the output process, have finite energy, and effect the steering of (1) from $\rho_0$ to $\rho_T$. With the
benefit of having resolved the controllability question, we now focus on minimum-energy steering, namely the following.


Problem 2: Determine $u^*$ that minimizes

$$ J(u) := E\left\{\int_0^T u(t)'u(t)\,dt\right\} < \infty \tag{13} $$

over all $u \in \mathcal{U}$, i.e., over adapted inputs that steer the system from state covariance $\Sigma_0$ to $\Sigma_T$.

Since $\Sigma_T$ is the covariance of $x(T)$, which is already specified, Problem 2 is equivalent to minimizing

$$ \tilde{J}(u) = E\left\{\int_0^T u(t)'u(t)\,dt\right\} + E\left\{x(T)'\Pi(T)x(T)\right\} $$

over all $u \in \mathcal{U}$, for any fixed symmetric matrix $\Pi(T)$, because the added term equals
$\operatorname{trace}(\Pi(T)\Sigma_T)$ and is the same for every admissible $u$. This observation allows us to identify the
form of an optimal control strategy. The reasoning is as follows. Without the terminal constraint on $u$ to meet the
end-point state density, it is standard that a $y$-adapted, finite-energy control minimizing $\tilde{J}$ is of the form

$$ u(t) = -B'\Pi(t)\hat{x}(t), \tag{14} $$

where $\hat{x}(t)$ is the Kalman estimate of $x(t)$ and $\Pi(t)$ satisfies the Riccati equation

$$ \dot{\Pi}(t) = -A'\Pi(t) - \Pi(t)A + \Pi(t)BB'\Pi(t) \tag{15} $$

with boundary value $\Pi(T)$ at $t = T$ (see e.g., [4]). Therefore, provided a suitable choice of $\Pi(T)$ can be found so
that the controlled state $x^{u^*}(T)$ at $t = T$ has covariance $\Sigma_T$, the control strategy (14) is the solution to
Problem 2. We summarize this conclusion as follows.


Theorem 3: Let $\Pi(\cdot)$ and $P(\cdot)$ be solutions of the Riccati differential equations (15) and (5), respectively,
and let $\hat{\Sigma}(\cdot)$ satisfy

$$ \dot{\hat{\Sigma}}(t) = (A - BB'\Pi(t))\hat{\Sigma}(t) + \hat{\Sigma}(t)(A - BB'\Pi(t))' + L(t)DD'L(t)' $$

with boundary conditions $\Pi(T)$, $P(0) = \Sigma_0$, and $\hat{\Sigma}(0) = 0$. If $\hat{\Sigma}(T) = \Sigma_T - P(T)$, then the
control law (14) solves Problem 2.

¹ That is, processes whose value at time $t$ depends on $t$ and on $\{y^u(s) \mid 0 \le s \le t\}$, for each $t \in [0, T]$.

Theorem 3 provides a sufficient condition for a control $u \in \mathcal{U}$ to be a solution of Problem 2. The boundary
condition $\Pi(T)$ in the statement of the theorem is not specified by the data of the problem; it only needs to be a
symmetric matrix and not necessarily positive semi-definite. Thus, the statement of the theorem suggests a shooting
method that iterates on the correspondence $\Pi(T) \mapsto \hat{\Sigma}(T)$ as an approach for obtaining optimal control
laws. In general, an optimal solution may not exist, and thus the type of computational approach that Theorem 3
naturally lends itself to requires a detailed investigation.

However, for any $\Sigma_T > P(T)$, Problem 2 is always feasible and therefore suboptimal solutions exist
(with cost arbitrarily close to $\inf_{u \in \mathcal{U}} J(u)$). A possible computational approach to construct such
suboptimal solutions is given next.


C. Numerical optimization scheme 


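We follow steps that are analogous to our recent work [2] on minimum-energy steering via state feedback.
Herein, we consider the control-energy functional

$$ \begin{aligned}
J(u) &= E\left\{\int_0^T (K(t)\hat{x}(t))'(K(t)\hat{x}(t))\,dt\right\} \\
&= \int_0^T \operatorname{trace}\left(K(t)(\Sigma(t) - P(t))K(t)'\right)dt \\
&= \int_0^T \operatorname{trace}\left(K(t)\hat{\Sigma}(t)K(t)'\right)dt
\end{aligned} $$

to be minimized over $K(t)$ so that (9) holds as well as the boundary conditions

$$ \hat{\Sigma}(0) = 0 \quad \text{and} \quad \hat{\Sigma}(T) = \Sigma_T - P(T). \tag{16} $$

Let $U(t) = -\hat{\Sigma}(t)K(t)'$. The objective function becomes²

$$ J(u) = \int_0^T \operatorname{trace}\left(U(t)'\hat{\Sigma}(t)^{-1}U(t)\right)dt, $$

which is jointly convex in $U(\cdot)$ and $\hat{\Sigma}(\cdot)$. The constraint (9) also becomes linear in $U$ and
$\hat{\Sigma}$, namely,

$$ \dot{\hat{\Sigma}} = A\hat{\Sigma} + \hat{\Sigma}A' + LDD'L' + BU' + UB'. \tag{17} $$

Optimizing $J(u)$ can now be recast as the semi-definite program to minimize

$$ \int_0^T \operatorname{trace}(Y(t))\,dt $$

subject to (16)-(17) and

$$ \begin{bmatrix} Y(t) & U(t)' \\ U(t) & \hat{\Sigma}(t) \end{bmatrix} \ge 0. $$

This can now be solved numerically via discretization in time and space. A (suboptimal) control feedback gain
$K(\cdot)$ can then be recovered by $K(t) = -U(t)'\hat{\Sigma}(t)^{-1}$.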
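The passage from the trace objective to a linear matrix inequality rests on a standard Schur-complement argument, sketched here for completeness. It assumes $\hat{\Sigma}(t) > 0$, which holds for $t \in (0, T]$, and uses the substitution $U = -\hat{\Sigma}K'$, so that $K = -U'\hat{\Sigma}^{-1}$.

```latex
\begin{aligned}
\operatorname{trace}\bigl(K\hat{\Sigma}K'\bigr)
  &= \operatorname{trace}\bigl(U'\hat{\Sigma}^{-1}\,\hat{\Sigma}\,\hat{\Sigma}^{-1}U\bigr)
   = \operatorname{trace}\bigl(U'\hat{\Sigma}^{-1}U\bigr), \\
\begin{bmatrix} Y & U' \\ U & \hat{\Sigma} \end{bmatrix} \ge 0
  &\iff Y \ge U'\hat{\Sigma}^{-1}U \qquad (\hat{\Sigma} > 0), \\
\min_{Y}\ \operatorname{trace}(Y)
  &= \operatorname{trace}\bigl(U'\hat{\Sigma}^{-1}U\bigr),
  \quad \text{attained at } Y = U'\hat{\Sigma}^{-1}U .
\end{aligned}
```

Minimizing $\operatorname{trace}(Y)$ pointwise in time under the block inequality therefore reproduces the original integrand.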


III. Infinite horizon steering 

We now consider the stationary counterpart of our problem to ensure a terminal state-distribution by output 
feedback. 


² Note that, as indicated earlier, $\hat{\Sigma}(t) > 0$ for $t \in (0, T]$.

A. Feasibility and characterization of stationary statistics


Consider the stationary Kalman filter

$$ d\hat{x}(t) = A\hat{x}(t)\,dt + Bu(t)\,dt + L\left(dy(t) - C\hat{x}(t)\,dt\right). $$

As usual, the Kalman gain is $L = PC'(DD')^{-1}$, where $P$ is the covariance of the estimation error
$\tilde{x}(t) = x(t) - \hat{x}(t)$ and satisfies the Algebraic Riccati Equation (ARE)

$$ AP + PA' + B_1B_1' - PC'(DD')^{-1}CP = 0. \tag{18} $$

It is a direct consequence of the optimality of the Kalman filter, just as in (6) for the finite-interval case, that for
any linear (dynamical and causal) control scheme that ensures stationarity, the covariance $\Sigma$ of the state vector
must satisfy

$$ \Sigma \ge P. \tag{19} $$

For any such input, define $S_{ux} := E\{u(t)x(t)'\}$ ($= S_{xu}'$). Standard Ito calculus gives

$$ \begin{aligned}
d(x(t)x(t)') ={}& \left(Ax(t)x(t)' + x(t)x(t)'A' + B_1B_1'\right)dt \\
&+ \left(Bu(t)x(t)' + x(t)u(t)'B'\right)dt \\
&+ B_1\,dw(t)\,x(t)' + x(t)\,dw(t)'B_1'.
\end{aligned} $$

By taking the expectation we obtain

$$ 0 = A\Sigma + \Sigma A' + B_1B_1' + BS_{ux} + S_{ux}'B', $$

and therefore, for any feasible state covariance $\Sigma$,

$$ A\Sigma + \Sigma A' + B_1B_1' + BX' + XB' = 0 \tag{20a} $$

can be solved for $X$.


Condition (20a) can be equivalently expressed as

$$ \operatorname{rank}\begin{bmatrix} A\Sigma + \Sigma A' + B_1B_1' & B \\ B' & 0 \end{bmatrix}
= \operatorname{rank}\begin{bmatrix} 0 & B \\ B' & 0 \end{bmatrix} \tag{20b} $$

and ensures that $A\Sigma + \Sigma A' + B_1B_1'$ is in the range of the linear map $X \mapsto BX' + XB'$, cf.
[5, Proposition 1]. See also [6] for an alternative but equivalent condition in terms of
$A\Sigma + \Sigma A' + B_1B_1'$ belonging to the kernel of a suitable operator. We summarize our conclusion as follows.

Theorem 4: If a positive-definite matrix $\Sigma > P$ can be assigned as the stationary state covariance of (1) via a
suitable choice of feedback control, then $\Sigma$ satisfies any of the equivalent statements (20a)-(20b).


We next discuss the converse direction. Here we explain that the equivalent conditions (20), together with
$\Sigma > P$, are almost sufficient for $\Sigma$ to be a stationary state covariance, in the sense that a covariance matrix
arbitrarily close to $\Sigma$ is admissible. Moreover, we show that this can be achieved by output feedback that is
implemented by a Kalman filter and control $u(t) = -K\hat{x}(t)$.


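We begin by considering the joint dynamics of the system state and estimation error
$\tilde{x}(t) = x(t) - \hat{x}(t)$, namely,

$$ \begin{bmatrix} dx \\ d\tilde{x} \end{bmatrix}
= \begin{bmatrix} A-BK & BK \\ 0 & A-LC \end{bmatrix}
\begin{bmatrix} x \\ \tilde{x} \end{bmatrix} dt
+ \begin{bmatrix} B_1\,dw \\ B_1\,dw - LD\,dv \end{bmatrix}. \tag{21} $$

The steady-state state covariance of this system satisfies the algebraic Lyapunov equation

$$ 0 = \begin{bmatrix} A-BK & BK \\ 0 & A-LC \end{bmatrix}
\begin{bmatrix} \Sigma & P \\ P & P \end{bmatrix}
+ \begin{bmatrix} \Sigma & P \\ P & P \end{bmatrix}
\begin{bmatrix} A-BK & BK \\ 0 & A-LC \end{bmatrix}'
+ \begin{bmatrix} B_1B_1' & B_1B_1' \\ B_1B_1' & B_1B_1' + LDD'L' \end{bmatrix}. $$

It follows that

$$ A\Sigma + \Sigma A' + B_1B_1' - BK(\Sigma - P) - (\Sigma - P)K'B' = 0 \tag{22a} $$

and

$$ (A - BK)(\Sigma - P) + (\Sigma - P)(A - BK)' + LDD'L' = 0, \tag{22b} $$

and therefore $\Sigma$ satisfies (20a) with $X = -(\Sigma - P)K'$, that is, for $K = -X'(\Sigma - P)^{-1}$.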
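The relation between $X$ and the gain $K$ can be checked numerically. The sketch below, assuming NumPy and SciPy, solves the ARE (18) with `scipy.linalg.solve_continuous_are`, forms $K = -X'(\Sigma - P)^{-1}$, and evaluates the residuals of (22a) and (22b). The helper name `stationary_gain` is hypothetical, and the data are assumed to match the example of Section IV with $\Sigma = \tfrac{1}{2}I$ and $X = [-1/2, -1/2]'$.

```python
import numpy as np
from scipy.linalg import solve_continuous_are


def stationary_gain(A, B1, C, D, Sigma, X):
    """Return (P, L, K): P from the ARE (18), L the Kalman gain, K = -X'(Sigma-P)^{-1}."""
    # solve_continuous_are(a, b, q, r) solves a'Z + Za - Z b r^{-1} b' Z + q = 0,
    # so passing (A', C') yields the filter ARE (18).
    P = solve_continuous_are(A.T, C.T, B1 @ B1.T, D @ D.T)
    L = P @ C.T @ np.linalg.inv(D @ D.T)
    K = -X.T @ np.linalg.inv(Sigma - P)
    return P, L, K


# Double-integrator data (assumed to match the example of Section IV).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
B1 = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.1]])
Sigma = 0.5 * np.eye(2)
X = np.array([[-0.5], [-0.5]])

P, L, K = stationary_gain(A, B1, C, D, Sigma, X)
Sigma_hat = Sigma - P
A_cl = A - B @ K

# Residuals of (22a) and (22b); both should vanish up to round-off.
res_22a = (A @ Sigma + Sigma @ A.T + B1 @ B1.T
           - B @ K @ Sigma_hat - Sigma_hat @ K.T @ B.T)
res_22b = A_cl @ Sigma_hat + Sigma_hat @ A_cl.T + L @ D @ D.T @ L.T

# A - BK must be Hurwitz for Sigma to be an admissible stationary covariance.
is_hurwitz = bool(np.linalg.eigvals(A_cl).real.max() < 0)
```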


Provided $A - BK$ is a Hurwitz matrix, $\Sigma$ is an admissible stationary covariance. However, in general, $A - BK$
may fail to be Hurwitz because of imaginary eigenvalues. In this case there is a "nearby" admissible stationary
state covariance. This can be shown by adapting an argument that was used for the case of state feedback
in [2, Remark 5]. Briefly, let $\hat{\Sigma} = \Sigma - P > 0$, and consider the control gain

$$ K_\epsilon = K + \tfrac{1}{2}\epsilon B'\hat{\Sigma}^{-1} $$

for $\epsilon > 0$. Then, from (22b),

$$ (A - BK_\epsilon)\hat{\Sigma} + \hat{\Sigma}(A - BK_\epsilon)' = -\epsilon BB' - LDD'L' \le -\epsilon BB'. \tag{23a} $$

Since $\hat{\Sigma} > 0$ and $(A, B)$ is controllable, it follows from (23a) that $A - BK_\epsilon$ is Hurwitz. Let
$\Sigma_\epsilon$ be the solution to

$$ (A - BK_\epsilon)(\Sigma_\epsilon - P) + (\Sigma_\epsilon - P)(A - BK_\epsilon)' = -LDD'L'. \tag{23b} $$

Then, the difference $\Delta = \Sigma - \Sigma_\epsilon \ge 0$ satisfies

$$ (A - BK_\epsilon)\Delta + \Delta(A - BK_\epsilon)' = -\epsilon BB', $$

and hence is of order $O(\epsilon)$. Thus, the algebraic condition that $\Sigma$ satisfies (20) and the positivity
constraint $\Sigma > P$ together are in effect sufficient, in the approximate sense just explained, since
$\Sigma_\epsilon$ is an admissible state covariance.


B. Conditions for optimality 

Since, in general, there may be more than one control that achieves a given stationary state covariance, we focus
on one that minimizes the expected input power (energy rate)

$$ J_{\rm power}(u) := E\{u'u\}. \tag{24} $$

Thus, assuming feasibility for a specified state covariance $\Sigma$, we consider the following problem.

Problem 5: Determine $u^*$ that minimizes³ $J_{\rm power}(u)$ over all $u(t) = -K\hat{x}(t)$ such that

$$ \rho(x) = (2\pi)^{-n/2} \det(\Sigma)^{-1/2} \exp\left(-\tfrac{1}{2}\, x'\Sigma^{-1}x\right) \tag{25} $$

is the stationary distribution for the state vector.

³ Or, equivalently, a $u$ that minimizes $\lim_{T\to\infty} \frac{1}{T} E\left\{\int_0^T u(t)'u(t)\,dt\right\}$.

Problem 5 admits the following finite-dimensional reformulation. Let $\mathcal{K}$ be the set of all $m \times n$ matrices
$K$ such that the corresponding feedback matrix $A - BK$ is Hurwitz. Since $E\{\hat{x}\hat{x}'\} = \hat{\Sigma}$,

$$ E\{u'u\} = E\{\hat{x}'K'K\hat{x}\} = \operatorname{trace}(K\hat{\Sigma}K'), \tag{26} $$

and Problem 5 reduces to finding a $K \in \mathcal{K}$ which minimizes

$$ J(K) = \operatorname{trace}(K\hat{\Sigma}K') \tag{27} $$

subject to the constraint (22a) or, equivalently in view of (18), (22b). Now, consider the Lagrangian

$$ \mathcal{L}(K, \Pi) = \operatorname{trace}(K\hat{\Sigma}K')
+ \operatorname{trace}\left(\Pi\left((A - BK)\hat{\Sigma} + \hat{\Sigma}(A' - K'B') + LDD'L'\right)\right). $$

Note that since $\mathcal{K}$ is open, a minimum point may fail to exist. Standard variational analysis leads to the form
$K = B'\Pi$ for the optimal gain. This analysis provides the following sufficient condition for optimality.
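The variational step can be made explicit as follows. This is a sketch of the first-order stationarity condition only, with $\Pi$ a symmetric multiplier and $\hat{\Sigma} = \Sigma - P > 0$; it does not address whether a minimizer exists.

```latex
\begin{aligned}
\mathcal{L}(K,\Pi) &= \operatorname{trace}\bigl(K\hat{\Sigma}K'\bigr)
  + \operatorname{trace}\bigl(\Pi(A\hat{\Sigma} + \hat{\Sigma}A' + LDD'L')\bigr)
  - 2\operatorname{trace}\bigl(\Pi BK\hat{\Sigma}\bigr), \\
\frac{\partial \mathcal{L}}{\partial K} &= 2K\hat{\Sigma} - 2B'\Pi\hat{\Sigma} = 0
  \quad\Longrightarrow\quad K = B'\Pi \qquad (\hat{\Sigma} > 0), \\
\frac{\partial \mathcal{L}}{\partial \Pi} &= 0
  \quad\Longrightarrow\quad
  (A - BB'\Pi)\hat{\Sigma} + \hat{\Sigma}(A - BB'\Pi)' + LDD'L' = 0 .
\end{aligned}
```

The first line uses the symmetry of $\Pi$ and $\hat{\Sigma}$ to combine the two cross terms.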

Proposition 1: Assume that there exists a symmetric matrix $\Pi$ such that $A - BB'\Pi$ is a Hurwitz matrix and

$$ (A - BB'\Pi)\hat{\Sigma} + \hat{\Sigma}(A - BB'\Pi)' + LDD'L' = 0 \tag{28} $$

holds. Then,

$$ u^*(t) = -B'\Pi\hat{x}(t) \tag{29} $$

is a solution to Problem 5.


C. Minimum energy control 


In a similar manner as before, we next provide a numerical scheme to compute a solution to Problem 5. With
$K = -X'\hat{\Sigma}^{-1}$ and $\hat{\Sigma} = \Sigma - P$, the average input power (energy rate) is

$$ E\{u'u\} = \operatorname{trace}(K\hat{\Sigma}K') = \operatorname{trace}(X'\hat{\Sigma}^{-1}X), \tag{30} $$

which is a convex quadratic expression in either $K$ or $X$. Thus, the optimal constant feedback gain $K$ can be
obtained by solving the convex optimization problem

$$ \min_X \left\{ \operatorname{trace}(X'\hat{\Sigma}^{-1}X) \;\middle|\; \text{(20a) holds} \right\}. \tag{31} $$

After obtaining the optimal $K$ from (31), we need to check whether $A - BK$ is Hurwitz or not. If not, the scheme
in (23) can be applied to approximate the solution.
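Because the constraint (20a) is linear in $X$ and the cost is a weighted Frobenius norm, one possible way to solve (31) without a general-purpose convex solver is a least-norm solve after a change of variables. The sketch below, assuming NumPy, factors $\hat{\Sigma} = RR'$ (Cholesky), sets $X = RZ$ so that the cost becomes $\|Z\|_F^2$, and takes the minimum-norm solution returned by `numpy.linalg.lstsq`. The function name `min_power_X` is hypothetical, and this is only one of several ways to solve (31).

```python
import numpy as np


def min_power_X(A, B, B1, Sigma, Sigma_hat):
    """Minimize trace(X' Sigma_hat^{-1} X) subject to (20a); return (X, residual).

    With Sigma_hat = R R' and X = R Z, the cost equals ||Z||_F^2, so the
    minimum-norm solution Z of the linear constraint yields the minimizer X.
    """
    n, m = B.shape
    R = np.linalg.cholesky(Sigma_hat)  # lower-triangular factor
    rhs = -(A @ Sigma + Sigma @ A.T + B1 @ B1.T).ravel()

    # Matrix of the linear map Z -> B X' + X B' with X = R Z, column by column.
    cols = []
    for k in range(n * m):
        z_k = np.zeros(n * m)
        z_k[k] = 1.0
        X_k = R @ z_k.reshape(n, m)
        cols.append((B @ X_k.T + X_k @ B.T).ravel())
    M = np.column_stack(cols)

    # lstsq returns the minimum-norm least-squares solution.
    z, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    X = R @ z.reshape(n, m)
    residual = float(np.linalg.norm(M @ z - rhs))  # ~0 when (20a) is solvable
    return X, residual


# Double-integrator data (assumed to match the example of Section IV).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
B1 = np.array([[0.0], [1.0]])
Sigma = 0.5 * np.eye(2)
P = np.array([[np.sqrt(0.002), 0.1], [0.1, np.sqrt(0.2)]])  # solves (18) here
Sigma_hat = Sigma - P

X, residual = min_power_X(A, B, B1, Sigma, Sigma_hat)
K = -X.T @ np.linalg.inv(Sigma_hat)
```

A residual that is not close to zero indicates that (20a) has no solution for the given $\Sigma$, in which case the returned $X$ is only a least-squares fit and should not be used.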



Fig. 1: State trajectories in phase space 
Fig. 2: Control input


IV. Numerical example 


We consider motion of particles modeled by

$$ \begin{aligned}
dx_1(t) &= x_2(t)\,dt, \\
dx_2(t) &= u(t)\,dt + dw(t), \\
dy(t) &= x_1(t)\,dt + 0.1\,dv(t).
\end{aligned} $$

Here, $u(t)$ represents a control input (force) at our disposal, $x_1(t)$ represents position, $x_2(t)$ velocity, $y(t)$
noisy (integral) position measurements, while $dw(t)$ models random white-noise forcing and $dv(t)$ represents
measurement noise. Our goal is to steer the spread of the particles from an initial Gaussian distribution with
$\Sigma_0 = I$ at $t = 0$ to a terminal marginal with $\Sigma_1 = \tfrac{1}{2}I$ at $t = 1$, and to maintain the particles'
distribution constant after $t = 1$. For the steering part on the interval $[0, 1]$, $\Sigma_1$ is feasible since
$\Sigma_1 > P(1)$, where

$$ P(1) = \begin{bmatrix} 0.0471 & 0.1049 \\ 0.1049 & 0.4587 \end{bmatrix} $$

is the estimation error covariance of the Kalman filter at $t = 1$. Maintaining the distribution is possible since
$\Sigma_1$ satisfies (20a) with $X = [-1/2, -1/2]'$, and $\Sigma_1$ is greater than the solution of the Algebraic Riccati
Equation (18), which is

$$ P = \begin{bmatrix} 0.0447 & 0.1000 \\ 0.1000 & 0.4472 \end{bmatrix}. $$

Moreover, the corresponding feedback gain $K = [5.4440, 19.7854]$ renders $A - BK$ Hurwitz.

We now implement the time-varying output feedback as explained in Section II to steer the distribution of
particles over the interval $[0, 1]$ and, from there on, we implement the stationary control consisting of the
stationary Kalman filter and the above constant gain. Figure 1 displays typical sample paths in phase space, as
functions of time, over the time window $[0, 3]$, and Figure 2 displays the corresponding (color coded) control signals,

$$ u(t) = -K(t)\hat{x}(t). $$

V. Concluding remarks 

We have addressed the problem of steering the state statistics of a linear stochastic system via output feedback.
In this case, where only partial state observation is available, we have provided necessary and sufficient conditions
for being able to specify a terminal Gaussian distribution for the state vector as well as a stationary Gaussian
distribution. The paper builds on our recent work [1], [2], where we studied the problem of steering state statistics
via state feedback. The viewpoint presented herein differs from standard Linear Quadratic Regulator theory [7],
[8] in that the control objective is specified directly in terms of terminal or stationary distributions for the state
vector. Applications of this viewpoint are envisioned in areas where a distribution rather than a set of values for the
state vector is a natural specification, e.g., in quality control, industrial and manufacturing processes, as well as in
thermally driven atomic force microscopy, the control of molecular motors, laser driven reactions, manipulation of
macromolecules, and so on; see e.g. [9], [10], [11], [12], [13]. Future work is expected to focus on such applications
based on this framework.